Abstract


The paper considers the Wasserstein metric between the empirical probability measure of $n$ discrete random variables and the continuous uniform measure on the $d$-dimensional ball, and gives an asymptotic estimate of its expectation as $n \to \infty$. We further consider the same problem for a mixed process, i.e., when the $n$ discrete random variables are produced by a Poisson process.
1 Introduction


Article [1] studied the Ollivier curvature of random geometric graphs. A key step there is estimating the Wasserstein metric between the empirical probability measure of $n$ discrete random variables and the continuous uniform measure on the $d$-dimensional ball. The authors applied results from [5], but those results are built on $[0,1]$ whereas the Ollivier curvature is built on balls, so the proof in [1] seems tedious and fragile. On the other hand, lattice methods in statistical mechanics often involve similar notation and the convergence of a discrete physical quantity to a continuous one, which should be closely linked to the convergence of a discrete probability measure to a continuous one. All of this motivates us to study the approximation problem between a discrete probability measure and a continuous one. In this article we choose the Wasserstein metric as the measure of approximation, and we hope it can serve as a bridge for studying the relation between discrete and continuous quantities in more settings.


2 Preliminary Estimation 


Definition 1. Let $X_1, X_2, \dots, X_n, Y_1, Y_2, \dots, Y_n$ be independent uniformly distributed random variables on the $d$-dimensional ball $B(0;1) = \{x \in \mathbb{R}^d,\ \|x\| < 1\}$, $d > 2$, where $\|\cdot\|$ is a norm on $\mathbb{R}^d$. The random variable

$$M_n^d := \inf_{\sigma} \sum_{i \le n} \|X_i - Y_{\sigma(i)}\|$$

denotes the optimal matching between $X_1, X_2, \dots, X_n$ and $Y_1, Y_2, \dots, Y_n$, where $\sigma$ ranges over all permutations of $\{1, 2, \dots, n\}$.
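As a small illustration of Definition 1, the following sketch computes the optimal matching cost by brute force over all permutations. It assumes the Euclidean norm and a tiny $n$ (the search is factorial in $n$); the helper names `sample_ball` and `matching_cost` are hypothetical and not taken from the paper.

```python
import itertools
import math
import random

def sample_ball(d, rng):
    """Draw one point uniformly from the Euclidean unit ball in R^d (rejection sampling)."""
    while True:
        p = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        if sum(c * c for c in p) <= 1.0:
            return p

def matching_cost(xs, ys):
    """Minimum over permutations sigma of sum_i |x_i - y_sigma(i)| (brute force)."""
    n = len(xs)
    dist = [[math.dist(x, y) for y in ys] for x in xs]
    return min(
        sum(dist[i][sigma[i]] for i in range(n))
        for sigma in itertools.permutations(range(n))
    )

rng = random.Random(0)   # fixed seed, only for reproducibility
d, n = 2, 6              # illustrative values; the paper's results are asymptotic in n
xs = [sample_ball(d, rng) for _ in range(n)]
ys = [sample_ball(d, rng) for _ in range(n)]
m = matching_cost(xs, ys)
print(f"optimal matching cost for n={n}, d={d}: {m:.4f}")
```

For realistic sample sizes one would replace the permutation search by an assignment solver; the brute-force version is kept here only because it mirrors the infimum over $\sigma$ literally.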


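By the dual principle [2][3], one has

$$M_n^d = \sup_{f \in \mathcal{L}_1} \sum_{i \le n} \big(f(X_i) - f(Y_i)\big),$$

where the set of Lipschitz functions is $\mathcal{L}_1 = \{f : B(0;1) \to \mathbb{R};\ |f(x) - f(y)| \le \|x - y\|,\ \forall x, y \in B(0;1),\ f(0) = 0\}$. Notice that every Lipschitz function in $\mathcal{L}_1$ may be extended to one in

$$\mathcal{L} = \{f : \mathbb{R}^d \to \mathbb{R};\ |f(x) - f(y)| \le \|x - y\|\ \forall x, y \in \mathbb{R}^d,\ f(0) = 0,\ \|f\|_{L^\infty} \le 1\}, \qquad (1)$$

so $\mathcal{L}_1 = \mathcal{L}|_{B(0;1)}$. The following Lemma 1 gives upper and lower bounds for the expectation $E(M_n^d)$.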

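Lemma 1 (optimal-matching). For the above optimal matching, one has for the dimension $d > 2$

$$\int_0^\infty e^{-u^d}\,du \le \liminf_{n \to \infty} \frac{E(M_n^d)}{n^{1-\frac{1}{d}}} \le \limsup_{n \to \infty} \frac{E(M_n^d)}{n^{1-\frac{1}{d}}} \le 2d + 5 + 8\sqrt{2}.$$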


Proof. We put the proof in the appendix. The method is essentially from [5]; we make some improvements and modifications to extend it to random variables on balls.


3 Main results and proofs 


The following are results on the Wasserstein metric between empirical and uniform measures on the $d$-dimensional ball. In general, the Wasserstein metric between two probability measures $\mu_1, \mu_2$ is given by the following definition.


Definition 2. Let $\mu_1$ and $\mu_2$ be Borel probability measures on a compact metric space $(X, d)$ and let $\Gamma(\mu_1, \mu_2)$ denote the set of joint probability measures $\mu$ on $X \times X$ whose marginals are $\mu_1$ and $\mu_2$ respectively. The Wasserstein metric is defined by

$$W_1^d(\mu_1, \mu_2) = \inf_{\mu \in \Gamma(\mu_1, \mu_2)} \int_{X \times X} d(x, y)\,d\mu(x, y).$$


By the duality principle (Kantorovich dual theorem), the Wasserstein metric can be expressed as

$$W_1^d(\mu_1, \mu_2) = \sup_{f \in \mathcal{L}_1(X)} \Big( \int_X f\,d\mu_1 - \int_X f\,d\mu_2 \Big),$$

where $\mathcal{L}_1(X)$ denotes the set of Lipschitz functions on $X$ (with respect to the metric $d$) with Lipschitz constant 1. Since adding a constant to $f$ does not change the difference of the two integrals, we may further assume that every $f \in \mathcal{L}_1(X)$ satisfies $f(0) = 0$.


Notice: Below, all metrics are induced by norms. To keep the notation consistent with most of the original articles, we still use $d$ for the metric on a space; this should not cause confusion with the dimension.


Theorem 1. Let $X_1, X_2, \dots, X_n$ be independent uniformly distributed random variables on the $d$-dimensional ball $B(0;1)$, let $m_n^d$ denote the empirical measure

$$m_n^d(y) = \frac{1}{n} \sum_{j=1}^{n} 1_{y = X_j}$$

and $\mu^d$ the uniform measure on $B(0;1)$. Then as $n \to \infty$,

$$E[W_1^d(m_n^d, \mu^d)] = O(n^{-\frac{1}{d}}), \quad d > 2.$$
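Proof.

$$W_1^d(m_n^d, \mu^d) = \inf_{\mu \in \Gamma(m_n^d, \mu^d)} \int_{B(0;1) \times B(0;1)} d(x, y)\,d\mu(x, y) = \sup_{f \in \mathcal{L}_1(B(0;1))} \Big( \int_{B(0;1)} f(x)\,dm_n^d(x) - \int_{B(0;1)} f(y)\,d\mu^d(y) \Big).$$

Let $Y_1, Y_2, \dots, Y_n$ be independent uniformly distributed random variables on $B(0;1)$, independent of $X = (X_1, \dots, X_n)$; then

$$\int_{B(0;1)} f(y)\,d\mu^d(y) = E f(Y_i), \quad i = 1, \dots, n.$$

So

$$\begin{aligned}
W_1^d(m_n^d, \mu^d) &= \sup_{f \in \mathcal{L}_1(B(0;1))} \Big( \int_{B(0;1)} f(x)\,dm_n^d(x) - \int_{B(0;1)} f(y)\,d\mu^d(y) \Big) \\
&= \frac{1}{n} \sup_{f \in \mathcal{L}_1(B(0;1))} \sum_{i=1}^{n} \big( f(X_i) - E f(Y_i) \big) \\
&= \frac{1}{n} \sup_{f \in \mathcal{L}_1(B(0;1))} \sum_{i=1}^{n} E\big[ f(X_i) - f(Y_i) \,\big|\, X \big] \\
&= \frac{1}{n} \sup_{f \in \mathcal{L}_1(B(0;1))} E\Big[ \sum_{i=1}^{n} \big( f(X_i) - f(Y_i) \big) \,\Big|\, X \Big] \\
&\le \frac{1}{n} E\Big[ \sup_{f \in \mathcal{L}_1(B(0;1))} \sum_{i=1}^{n} \big( f(X_i) - f(Y_i) \big) \,\Big|\, X \Big] \\
&= \frac{1}{n} E\big[ M_n^d \,\big|\, X \big],
\end{aligned}$$

hence from Lemma 1 one has

$$E[W_1^d(m_n^d, \mu^d)] \le \frac{1}{n} E[M_n^d] = O(n^{-\frac{1}{d}}).$$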


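Corollary 1. More generally, let $X_1, X_2, \dots, X_n$ be independent uniformly distributed random variables on the $d$-dimensional ball $B(0, \delta)$ with radius $\delta > 0$, let $m_n^d$ denote the empirical measure

$$m_n^d(y) = \frac{1}{n} \sum_{j=1}^{n} 1_{y = X_j},$$

and $\mu^d$ the uniform measure on $B(0, \delta)$. Then

$$E[W_1^d(m_n^d, \mu^d)] = \delta\, O(n^{-\frac{1}{d}}), \quad d > 2.$$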


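Proof. Let $y = \delta t$ and denote

$$\tilde m_n^d(t) = \delta^d m_n^d(\delta t) = \frac{1}{n} \sum_{j=1}^{n} 1_{t = X_j/\delta}, \qquad \tilde\mu^d(t) = \delta^d \mu^d(\delta t),$$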
202112.00122v1 


chinaXiv 


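then $\tilde m_n^d, \tilde\mu^d$ are the empirical and uniform measures, respectively, in the variable $t$ on $B(0;1)$. In particular one has

$$\begin{aligned}
W_1^d(m_n^d, \mu^d) &= \inf_{\mu \in \Gamma(m_n^d, \mu^d)} \int_{B(0,\delta) \times B(0,\delta)} d(x, y)\,\mu(x, y)\,dx\,dy \\
&= \inf_{\mu \in \Gamma(m_n^d, \mu^d)} \int_{B(0;1) \times B(0;1)} d(\delta t, \delta \tau)\,\mu(\delta t, \delta \tau)\,\delta^{2d}\,dt\,d\tau \\
&= \inf_{\tilde\mu \in \Gamma(\tilde m_n^d, \tilde\mu^d)} \int_{B(0;1) \times B(0;1)} \delta\, d(t, \tau)\,\tilde\mu(t, \tau)\,dt\,d\tau \\
&= \delta\, W_1^d(\tilde m_n^d, \tilde\mu^d).
\end{aligned} \qquad (4)$$

So, by Theorem 1,

$$E[W_1^d(m_n^d, \mu^d)] = \delta\, O(n^{-\frac{1}{d}}), \quad d > 2.$$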
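The scaling identity behind Corollary 1 (distances, and hence the Wasserstein metric, scale linearly in $\delta$) can be checked numerically in a one-dimensional analogue. This is only a sketch under an assumption that is not part of the paper: it takes $d = 1$, where $B(0,\delta) = [-\delta, \delta]$ and $W_1$ has the closed form $\int |F_n - F|\,dx$. The function name `w1_to_uniform_1d` is hypothetical.

```python
import random

def w1_to_uniform_1d(points, delta):
    """W_1 between the empirical measure of `points` and the uniform law on [-delta, delta].

    Uses the one-dimensional identity W_1 = integral of |F_n(x) - F(x)| dx,
    integrated exactly on each interval between consecutive sample points.
    """
    n = len(points)
    knots = [-delta] + sorted(points) + [delta]
    total = 0.0
    for k in range(n + 1):
        a, b = knots[k], knots[k + 1]
        # On (a, b) the empirical CDF equals k/n; the uniform CDF is linear.
        ga = k / n - (a + delta) / (2 * delta)
        gb = k / n - (b + delta) / (2 * delta)
        if ga * gb < 0:
            # Sign change: two triangles under |g|, slope magnitude 1/(2*delta).
            total += delta * (ga * ga + gb * gb)
        else:
            # No sign change: trapezoid under |g|.
            total += 0.5 * abs(ga + gb) * (b - a)
    return total

rng = random.Random(1)
unit = [rng.uniform(-1.0, 1.0) for _ in range(10)]   # sample on [-1, 1]
delta = 3.0                                          # illustrative radius
scaled = [delta * t for t in unit]                   # same sample on [-delta, delta]
w_unit = w1_to_uniform_1d(unit, 1.0)
w_scaled = w1_to_uniform_1d(scaled, delta)
print(w_scaled, delta * w_unit)  # the two values agree up to rounding
```

The check mirrors the change of variables $y = \delta t$ in the proof: rescaling the sample and the reference measure together multiplies the transport cost by $\delta$.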


Theorem 2. Consider a Poisson process $\mathcal{P}$ with intensity measure $(1 + \alpha_n)\, n\, \frac{dV}{|B(0;1)|}$ on the $d$-dimensional ball $B(0;1)$, for some sequence $0 < \alpha_n \to 0$. Let $m_n^d$ denote the empirical random measure with respect to $\mathcal{P}$, i.e.

$$m_n^d(y) = \frac{1}{|\mathcal{P}|} \sum_{p \in \mathcal{P}} 1_{y = p},$$

and $\mu^d$ the uniform measure on the above ball. Then

$$E[W_1^d(m_n^d, \mu^d)] = O(n^{-\frac{1}{d}}).$$


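Proof. Since the number of random points underlying $m_n^d$ is produced by the mixed random process $\mathcal{P}$,

$$E[W_1^d(m_n^d, \mu^d)] = \sum_{k=1}^{\infty} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big).$$

From Theorem 1, we know

$$E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big] = O(k^{-\frac{1}{d}}).$$

On the other hand, writing $\lambda_n = (1 + \alpha_n)n$ for brevity, from Lemma 1.2 in [4] one has

$$\begin{aligned}
P\big(\mathrm{Po}(\lambda_n) \ge \lambda_n + c\sqrt{\lambda_n \log n}\big)
&\le \exp\Big( c\sqrt{\lambda_n \log n} - \big(\lambda_n + c\sqrt{\lambda_n \log n}\big) \log \frac{\lambda_n + c\sqrt{\lambda_n \log n}}{\lambda_n} \Big) \\
&\le \exp\Big( -\frac{c^2 \lambda_n \log n}{2\big(\lambda_n + c\sqrt{\lambda_n \log n}\big)} \Big) = O(n^{-\frac{c^2}{2}}),
\end{aligned} \qquad (5)$$

where $c$ is a constant to be determined later, and

$$P\big(\mathrm{Po}(\lambda_n) \le \lambda_n - c\sqrt{\lambda_n \log n}\big) \le \exp\Big( -c\sqrt{\lambda_n \log n} + \big(\lambda_n - c\sqrt{\lambda_n \log n}\big) \log \frac{\lambda_n}{\lambda_n - c\sqrt{\lambda_n \log n}} \Big) = O(n^{-\frac{c^2}{2}}). \qquad (6)$$

Denote

$$a_n^{\pm} = \Big[ (1 + \alpha_n)n \pm c\sqrt{(1 + \alpha_n)\, n \log n} \Big].$$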
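Then

$$\begin{aligned}
E[W_1^d(m_n^d, \mu^d)] &= \sum_{k=1}^{a_n^- - 1} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big) \\
&\quad + \sum_{k=a_n^-}^{a_n^+} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big) \\
&\quad + \sum_{k=a_n^+ + 1}^{\infty} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big) \\
&=: I_1 + I_2 + I_3.
\end{aligned}$$

Since the Wasserstein metric on $B(0;1)$ is at most the diameter 2, one has

$$I_1 = \sum_{k=1}^{a_n^- - 1} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big) \le 2 P\big(\mathrm{Po}((1 + \alpha_n)n) < a_n^-\big) = O(n^{-\frac{c^2}{2}})$$

and

$$I_3 = \sum_{k=a_n^+ + 1}^{\infty} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big) \le 2 P\big(\mathrm{Po}((1 + \alpha_n)n) > a_n^+\big) = O(n^{-\frac{c^2}{2}}).$$

For the last part $I_2$, one has

$$\begin{aligned}
I_2 &= \sum_{k=a_n^-}^{a_n^+} E\big[W_1^d(m_n^d, \mu^d) \,\big|\, |\mathcal{P}| = k\big]\, P\big(\mathrm{Po}((1 + \alpha_n)n) = k\big) \\
&= \sum_{k=a_n^-}^{a_n^+} O(k^{-\frac{1}{d}})\, e^{-(1 + \alpha_n)n} \frac{((1 + \alpha_n)n)^k}{k!} \\
&\le O\Big( \big( (1 + \alpha_n)n - c\sqrt{(1 + \alpha_n)\, n \log n} \big)^{-\frac{1}{d}} \Big) = O(n^{-\frac{1}{d}}).
\end{aligned}$$

Finally, choosing $c$ large enough that the tail terms $I_1$ and $I_3$ are also $O(n^{-\frac{1}{d}})$, one has

$$E[W_1^d(m_n^d, \mu^d)] = O(n^{-\frac{1}{d}}).$$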
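The Poisson tail bounds (5) and (6) used in the proof can be sanity-checked numerically. The sketch below compares exact Poisson tails with the exponential bounds, assuming the standard Chernoff forms $\exp\big(x - (\lambda + x)\log\frac{\lambda + x}{\lambda}\big)$ and $\exp\big(-x + (\lambda - x)\log\frac{\lambda}{\lambda - x}\big)$ with $\lambda = (1 + \alpha_n)n$ and $x = c\sqrt{\lambda \log n}$. The values of `n`, `alpha_n` and `c` are illustrative choices, not taken from the paper.

```python
import math

def poisson_pmf(k, lam):
    """P(Po(lam) = k), computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def upper_tail(lam, k0, span=2000):
    """P(Po(lam) >= k0), truncating the series after `span` terms (a slight underestimate)."""
    return sum(poisson_pmf(k, lam) for k in range(k0, k0 + span))

def lower_tail(lam, k0):
    """P(Po(lam) <= k0)."""
    return sum(poisson_pmf(k, lam) for k in range(0, k0 + 1))

def chernoff_upper(lam, x):
    """exp(x - (lam + x) * log((lam + x) / lam)), the form assumed for (5)."""
    return math.exp(x - (lam + x) * math.log((lam + x) / lam))

def chernoff_lower(lam, x):
    """exp(-x + (lam - x) * log(lam / (lam - x))), the form assumed for (6)."""
    return math.exp(-x + (lam - x) * math.log(lam / (lam - x)))

n, alpha_n, c = 1000, 0.01, 2.0          # illustrative values only
lam = (1 + alpha_n) * n
x = c * math.sqrt(lam * math.log(n))

p_up = upper_tail(lam, math.ceil(lam + x))
p_lo = lower_tail(lam, math.floor(lam - x))
print(p_up, chernoff_upper(lam, x))
print(p_lo, chernoff_lower(lam, x), n ** (-c * c / 2))
```

In this example both exact tails fall below their exponential bounds, and the lower-tail bound is itself below $n^{-c^2/2}$, which is the polynomial decay the proof relies on when choosing $c$.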


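Corollary 2. Let $0 < \alpha_n \to 0$, $x \in \mathbb{R}^d$, and consider a Poisson process $\mathcal{P}$ with intensity measure $(1 + \alpha_n)\, n\, \frac{dV}{|B(0;1)|}$ on the $d$-dimensional ball $B(x; \delta)$ with radius $\delta > 0$. Let $m_n^d$ denote the empirical measure with respect to $\mathcal{P}$, i.e.

$$m_n^d(y) = \frac{1}{|\mathcal{P}|} \sum_{p \in \mathcal{P}} 1_{y = p},$$

and $\mu^d$ the uniform measure on $B(x; \delta)$. Then

$$E[W_1^d(m_n^d, \mu^d)] = O(n^{-\frac{1}{d}}).$$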


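Proof. The argument follows the proof of Theorem 2. First, the size $|\mathcal{P}|$ has a Poisson distribution on $B(x; \delta)$ with mean value $(1 + \alpha_n)\, n\, \delta^d$, so $n\delta^d$ plays the role of $n$. Rescaling to the unit ball as in the proof of Corollary 1 (the tilde denotes the rescaled measures), one gets

$$E[W_1^d(m_n^d, \mu^d)] = E[W_1^d(m_{n\delta^d}^d, \mu^d)] = \delta\, E[W_1^d(\tilde m_{n\delta^d}^d, \tilde\mu^d)] = \delta\, O\big((n\delta^d)^{-\frac{1}{d}}\big) = O(n^{-\frac{1}{d}}).$$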


Conclusion. Corollary 2 may be applied directly to obtain Appendix A.3 in [1]. Next we hope to apply our method to analyze the electrostatic approximation problem.


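A The lower bound of $E(M_n^d)$

Since

$$M_n^d = \inf_{\sigma} \sum_{i \le n} \|X_i - Y_{\sigma(i)}\| \ge \inf_{\sigma} \sum_{i \le n} \min_{j \le n} \|X_i - Y_j\| = \sum_{i \le n} \min_{j \le n} \|X_i - Y_j\|,$$

one has

$$E(M_n^d \mid X) \ge \sum_{i \le n} E\big( \min_{j \le n} \|X_i - Y_j\| \,\big|\, X \big) \ge n \min_{x \in B(0;1)} E\big( \min_{j \le n} \|x - Y_j\| \big).$$

Let $B(x, t) = \{y \in \mathbb{R}^d : \|x - y\| \le t\}$; then $\frac{|B(x,t) \cap B(0;1)|}{|B(0;1)|} \le \min\{t^d, 1\}$ and $P(\min_{j \le n} \|x - Y_j\| \ge t) \ge (1 - t^d)^n$ for $t \le 1$. Thus

$$E \min_{j \le n} \|x - Y_j\| = \int_0^\infty P\big( \min_{j \le n} \|x - Y_j\| \ge t \big)\,dt \ge \int_0^1 (1 - t^d)^n\,dt = n^{-\frac{1}{d}} \int_0^{n^{1/d}} \big(1 - u^d/n\big)^n\,du.$$

Finally, by Fatou's lemma, one has

$$\liminf_{n \to \infty} \frac{E(M_n^d)}{n^{1-\frac{1}{d}}} \ge \int_0^\infty e^{-u^d}\,du.$$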
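To get a feel for the constant in the lower bound, the following sketch evaluates the finite-$n$ integral $\int_0^{n^{1/d}} (1 - u^d/n)^n\,du$ numerically and compares it with its pointwise limit $\int_0^\infty e^{-u^d}\,du = \Gamma(1 + 1/d)$ (a standard identity). The function names, the midpoint rule and the values of `d` and `n` are illustrative assumptions.

```python
import math

def limit_constant(d):
    """Closed form of the integral of exp(-u**d) over [0, inf): Gamma(1 + 1/d)."""
    return math.gamma(1.0 + 1.0 / d)

def finite_n_integral(d, n, steps=200_000):
    """Midpoint-rule value of the integral of (1 - u**d / n)**n over [0, n**(1/d)]."""
    upper = n ** (1.0 / d)
    h = upper / steps
    return h * sum((1.0 - ((k + 0.5) * h) ** d / n) ** n for k in range(steps))

d, n = 3, 10_000                      # illustrative values
approx = finite_n_integral(d, n)
limit = limit_constant(d)
print(f"finite-n integral: {approx:.6f}, limit Gamma(1 + 1/d): {limit:.6f}")
```

Because $(1 - y/n)^n \le e^{-y}$, the finite-$n$ value stays below the limit and approaches it as $n$ grows.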


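B The upper bound of $E(M_n^d)$

Set $r = n^{-\frac{1}{d}}$, so $r \to 0$ as $n \to \infty$ and $|B(x, r)| = \omega_d r^d = \frac{\omega_d}{n}$, where $\omega_d = |B(0;1)|$. Set

$$u(i, j) = \begin{cases} 1, & \|X_i - Y_j\| \le r, \\ 0, & \text{otherwise.} \end{cases}$$

Set

$$b(x) = \frac{|B(x, r) \cap B(0;1)|}{|B(0;1)|},$$

then $b(x) \le \frac{1}{n} \le 1$ if $n \ge 1$. First decompose $\sum_{i \le n} (f(X_i) - f(Y_i))$ as follows:

$$\sum_{i \le n} \big(f(X_i) - f(Y_i)\big) = \Big[ \sum_{i \le n} f(X_i) \sum_{j \le n} u(i, j) - \sum_{j \le n} f(Y_j) \sum_{i \le n} u(i, j) \Big] + \Big[ \sum_{i \le n} f(X_i) \Big(1 - \sum_{j \le n} u(i, j)\Big) - \sum_{j \le n} f(Y_j) \Big(1 - \sum_{i \le n} u(i, j)\Big) \Big],$$

so one has

$$\begin{aligned}
\Big| \sum_{i \le n} \big(f(X_i) - f(Y_i)\big) \Big| &\le \Big| \sum_{i \le n} f(X_i) \sum_{j \le n} u(i, j) - \sum_{j \le n} f(Y_j) \sum_{i \le n} u(i, j) \Big| \\
&\quad + \Big| \sum_{i \le n} f(X_i) \Big(1 - \sum_{j \le n} u(i, j)\Big) - \sum_{j \le n} f(Y_j) \Big(1 - \sum_{i \le n} u(i, j)\Big) \Big| \\
&= I_1 + I_2.
\end{aligned} \qquad (7)$$

Below we estimate the two parts $E \sup_{f \in \mathcal{L}_1} I_1$ and $E \sup_{f \in \mathcal{L}_1} I_2$ and finally get

$$E(M_n) = E \sup_{f \in \mathcal{L}_1} \Big| \sum_{i \le n} \big(f(X_i) - f(Y_i)\big) \Big| \le E \sup_{f \in \mathcal{L}_1} I_1 + E \sup_{f \in \mathcal{L}_1} I_2 \le n^{1-\frac{1}{d}} + 2d\, n^{1-\frac{1}{d}} + 4 n^{1-\frac{1}{d}} + 8\sqrt{2}\, n^{\frac{1}{2}},$$

hence one has

$$\frac{E(M_n)}{n^{1-\frac{1}{d}}} \le 1 + 2d + 4 + 8\sqrt{2}\, n^{\frac{1}{d} - \frac{1}{2}} = 5 + 2d + 8\sqrt{2}\, n^{\frac{1}{d} - \frac{1}{2}}.$$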


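B.1 The estimation of $E \sup_{f \in \mathcal{L}_1} I_1$

Since $f$ is Lipschitz and $u(i, j) = 0$ unless $\|X_i - Y_j\| \le r$,

$$I_1 = \Big| \sum_{i \le n} f(X_i) \sum_{j \le n} u(i, j) - \sum_{j \le n} f(Y_j) \sum_{i \le n} u(i, j) \Big| = \Big| \sum_{i \le n} \sum_{j \le n} u(i, j) \big(f(X_i) - f(Y_j)\big) \Big| \le r \sum_{i \le n} \sum_{j \le n} u(i, j).$$

Therefore

$$E \sup_{f \in \mathcal{L}_1} I_1 \le r\, E \sum_{i \le n} \sum_{j \le n} u(i, j) = r \sum_{i \le n} \sum_{j \le n} E\, E\big(u(i, j) \mid X_i\big) = r \sum_{i \le n} \sum_{j \le n} E\, b(X_i) \le r \sum_{i \le n} \sum_{j \le n} \frac{1}{n} = n r = n^{1-\frac{1}{d}}.$$

Note that the estimate of $I_1$ may be further optimized; see [5].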
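B.2 The estimation of $E \sup_{f \in \mathcal{L}} I_2$

Decompose $I_2$ as follows:

$$\begin{aligned}
I_2 &\le \Big| \sum_{i \le n} f(X_i) \big(1 - n b(X_i)\big) - \sum_{j \le n} f(Y_j) \big(1 - n b(Y_j)\big) \Big| \\
&\quad + \Big| \sum_{i \le n} f(X_i) \Big( n b(X_i) - \sum_{j \le n} u(i, j) \Big) \Big| + \Big| \sum_{j \le n} f(Y_j) \Big( n b(Y_j) - \sum_{i \le n} u(i, j) \Big) \Big| \\
&=: I_{21} + I_{22} + I_{23}.
\end{aligned}$$

We estimate these three parts and finally get

$$E \sup_{f \in \mathcal{L}} I_2 \le E \sup_{f \in \mathcal{L}} I_{21} + E \sup_{f \in \mathcal{L}} I_{22} + E \sup_{f \in \mathcal{L}} I_{23} \le 2d\, n^{1-\frac{1}{d}} + 4 n^{1-\frac{1}{d}} + 8\sqrt{2}\, n^{\frac{1}{2}}.$$

B.2.1 The estimation of $E \sup_{f \in \mathcal{L}} I_{21}$

Notice that $\|f\|_{L^\infty(B(0;1))} \le 1$, and that $1 - n b(x)$ lies in $[0, 1]$ and vanishes whenever $B(x, r) \subset B(0;1)$, i.e. unless $x$ lies within distance $r$ of the boundary, an event of probability $1 - (1 - r)^d \le dr$. Hence one has

$$E \sup_{f \in \mathcal{L}} \Big| \sum_{i \le n} f(X_i) \big(1 - n b(X_i)\big) \Big| \le E\Big( \sum_{i \le n} |1 - n b(X_i)| \Big) = \sum_{i \le n} E\big( |1 - n b(X_i)| \big) \le n d r = d\, n^{1-\frac{1}{d}},$$

so we get

$$\begin{aligned}
E \sup_{f \in \mathcal{L}} I_{21} &= E \sup_{f \in \mathcal{L}} \Big| \sum_{i \le n} f(X_i) \big(1 - n b(X_i)\big) - \sum_{j \le n} f(Y_j) \big(1 - n b(Y_j)\big) \Big| \\
&\le E \sup_{f \in \mathcal{L}} \Big| \sum_{i \le n} f(X_i) \big(1 - n b(X_i)\big) \Big| + E \sup_{f \in \mathcal{L}} \Big| \sum_{j \le n} f(Y_j) \big(1 - n b(Y_j)\big) \Big| \\
&\le 2d\, n^{1-\frac{1}{d}}.
\end{aligned}$$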


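B.2.2 The estimation of $E \sup_{f \in \mathcal{L}} I_{22}$, $E \sup_{f \in \mathcal{L}} I_{23}$

This estimate is more difficult. One may use a convolution decomposition to localize $f$ to small regions, which yields the estimate

$$E \sup_{f \in \mathcal{L}} I_{22} = E \sup_{f \in \mathcal{L}} I_{23} = E \sup_{f \in \mathcal{L}} \Big| \sum_{i \le n} f(X_i) \Big( n b(X_i) - \sum_{j \le n} u(i, j) \Big) \Big| \le 2 n^{1-\frac{1}{d}} + 4\sqrt{2}\, n^{\frac{1}{2}}.$$

First we assume $f$ is the indicator function of some set $A$, and estimate $E\big( \big| \sum_{i \le n} 1_A(X_i) \big( n b(X_i) - \sum_{j \le n} u(i, j) \big) \big|^2 \big)$. One has

$$E\Big( \Big| \sum_{i \le n} 1_A(X_i) \Big( n b(X_i) - \sum_{j \le n} u(i, j) \Big) \Big|^2 \Big) = E\Big( \sum_{i, i' \le n} \sum_{j, j' \le n} 1_A(X_i) \big( b(X_i) - u(i, j) \big)\, 1_A(X_{i'}) \big( b(X_{i'}) - u(i', j') \big) \Big).$$

Discussing the different cases of $i, i'$ and $j, j'$, one gets

$$E\Big( \Big| \sum_{i \le n} 1_A(X_i) \Big( n b(X_i) - \sum_{j \le n} u(i, j) \Big) \Big|^2 \Big) \le 2n\, \frac{|A|}{|B(0;1)|}. \qquad (8)$$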

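Second, consider convolutions. Let $g$ be a bounded Lipschitz function on $\mathbb{R}^d$, $h$ a function with bounded support on $\mathbb{R}^d$, and $g * h(x) = \int_{\mathbb{R}^d} g(t)\, h(x - t)\,dt$. We estimate the expectation of

$$\Big| \sum_{i \le n} g * h(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| = \Big| \int_{\mathbb{R}^d} g(t) \sum_{i \le n} \sum_{j \le n} h(X_i - t) \big( b(X_i) - u(i, j) \big)\,dt \Big|.$$

That is, by the Cauchy-Schwarz inequality,

$$\begin{aligned}
E\Big( \Big| \sum_{i \le n} g * h(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big)
&= E\Big( \Big| \int_{\mathbb{R}^d} g(t) \sum_{i \le n} \sum_{j \le n} h(X_i - t) \big( b(X_i) - u(i, j) \big)\,dt \Big| \Big) \\
&\le \|g\|_{L^\infty}\, E \int_{\mathbb{R}^d} \Big| \sum_{i \le n} \sum_{j \le n} h(X_i - t) \big( b(X_i) - u(i, j) \big) \Big|\,dt \\
&\le |\mathrm{supp}(h)|^{\frac{1}{2}}\, \|g\|_{L^\infty} \Big( \int_{\mathbb{R}^d} E\Big( \Big| \sum_{i \le n} \sum_{j \le n} h(X_i - t) \big( b(X_i) - u(i, j) \big) \Big|^2 \Big)\,dt \Big)^{\frac{1}{2}}.
\end{aligned}$$

Further set $h(x) = c_0 1_A(x)$; then by formula (8),

$$\int_{\mathbb{R}^d} E\Big( \Big| \sum_{i \le n} \sum_{j \le n} c_0 1_A(X_i - t) \big( b(X_i) - u(i, j) \big) \Big|^2 \Big)\,dt \le 2 n c_0^2 |A|,$$

hence one has the following estimate for the convolution of $g$ with a characteristic function:

$$E\Big( \Big| \sum_{i \le n} g * (c_0 1_A)(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big) \le |A|^{\frac{1}{2}} c_0 \big( 2 |A| n \big)^{\frac{1}{2}} \|g\|_{L^\infty} = \sqrt{2}\, c_0 |A|\, n^{\frac{1}{2}} \|g\|_{L^\infty}. \qquad (9)$$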


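Finally, we decompose $f$ into a sum of well-defined convolutions and estimate each part. Since a Lipschitz function $f$ on $B(0;1) \subset \mathbb{R}^d$ with $f(0) = 0$ may be extended to a Lipschitz function on all of $\mathbb{R}^d$ with $\|f\|_{L^\infty} \le 1$, we may consider $f \in \mathcal{L}$ and decompose $f = \sum_{l=1}^{q+1} f_l$, where

$$h_l(x) = \begin{cases} |B(0;1)|^{-1} (2^l r)^{-d}, & x \in B(0, 2^l r), \\ 0, & \text{otherwise,} \end{cases}$$

$$f_1 = f - f * h_1, \quad f_2 = f * h_1 - f * h_2 * h_1, \quad \dots, \quad f_l = f * h_1 * \cdots * h_{l-1} - f * h_l * h_{l-1} * \cdots * h_1, \quad \dots, \quad f_{q+1} = f * h_q * \cdots * h_1,$$

and $q$ is determined by $2^{q+1} r \ge 1 > 2^q r = 2^q n^{-\frac{1}{d}}$. So

$$E \sup_{f \in \mathcal{L}} I_{22} = E \sup_{f \in \mathcal{L}} \Big| \sum_{l=1}^{q+1} \sum_{i \le n} f_l(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \le \sum_{l=1}^{q+1} E\Big( \sup_{f \in \mathcal{L}} \Big| \sum_{i \le n} f_l(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big).$$

For the first part, one has

$$E\Big( \Big| \sum_{i \le n} f_1(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big) \le \sum_{i \le n} \|f - f * h_1\|_{L^\infty}\, E\Big( \Big| \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big) \le 2 n r,$$

where we omit the computation of $E\big( \big| \sum_{j \le n} ( b(X_i) - u(i, j) ) \big|^2 \big) \le 1$, which further gives $E\big( \big| \sum_{j \le n} ( b(X_i) - u(i, j) ) \big| \big) \le 1$.

For the expectation involving $f_l = (f - f * h_l) * h_1 * \cdots * h_{l-1}$ with $2 \le l \le q$, one has from formula (9)

$$E\Big( \Big| \sum_{i \le n} f_l(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big) \le \sqrt{2}\, |B(0;1)|^{-1} (2^{l-1} r)^{-d}\, |\mathrm{supp}(h_{l-1})|\, n^{\frac{1}{2}}\, \|(f - f * h_l) * h_1 * \cdots * h_{l-2}\|_{L^\infty} \le \sqrt{2}\, n^{\frac{1}{2}}\, 2^l r.$$

For the last part, involving $f_{q+1} = f * h_1 * \cdots * h_{q-1} * h_q$, one has by (9) again

$$E\Big( \Big| \sum_{i \le n} f_{q+1}(X_i) \sum_{j \le n} \big( b(X_i) - u(i, j) \big) \Big| \Big) \le \sqrt{2}\, |B(0;1)|^{-1} (2^q r)^{-d}\, |\mathrm{supp}(h_q)|\, n^{\frac{1}{2}}\, \|f * h_1 * \cdots * h_{q-1}\|_{L^\infty} \le \sqrt{2}\, n^{\frac{1}{2}}\, 2^{q+1} r.$$

Summing all the estimates, one has

$$E \sup_{f \in \mathcal{L}} I_{22} \le 2 n r + \big( \sqrt{2}\, n^{\frac{1}{2}}\, 2^2 r + \cdots + \sqrt{2}\, n^{\frac{1}{2}}\, 2^{q+1} r \big) \le 2 n^{1-\frac{1}{d}} + 4\sqrt{2}\, n^{\frac{1}{2}}.$$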
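The final summation over dyadic scales can be checked numerically. The sketch below assumes the per-scale bounds read from this subsection ($2nr$ for $l = 1$ and $\sqrt{2}\, n^{1/2}\, 2^l r$ for $2 \le l \le q + 1$, with $r = n^{-1/d}$ and $2^{q+1} r \ge 1 > 2^q r$) and verifies that their sum stays below $2 n^{1-1/d} + 4\sqrt{2}\, n^{1/2}$. The function names are hypothetical.

```python
import math

def dyadic_scales(n, d):
    """Return r = n**(-1/d) and the index q with 2**(q+1) * r >= 1 > 2**q * r (needs n > 1)."""
    r = n ** (-1.0 / d)
    q = 0
    while 2 ** (q + 1) * r < 1.0:
        q += 1
    return r, q

def i22_bound_terms(n, d):
    """Per-scale bounds: 2*n*r for l = 1, and sqrt(2n) * 2**l * r summed over 2 <= l <= q+1."""
    r, q = dyadic_scales(n, d)
    first = 2.0 * n * r
    rest = sum(math.sqrt(2.0 * n) * 2 ** l * r for l in range(2, q + 2))
    return first, rest

n, d = 1000, 3                                   # illustrative values
r, q = dyadic_scales(n, d)
first, rest = i22_bound_terms(n, d)
claimed = 2.0 * n ** (1.0 - 1.0 / d) + 4.0 * math.sqrt(2.0 * n)
print(f"q = {q}, sum of per-scale bounds = {first + rest:.3f}, claimed bound = {claimed:.3f}")
```

The geometric sum $\sum_{l=2}^{q+1} 2^l r \le 2^{q+2} r < 4$ is what produces the constant $4\sqrt{2}$ in front of $n^{1/2}$.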